Improved audio coding using a psychoacoustic model based on a cochlear filter bank

Author

  • Frank Baumgarte
Abstract

Perceptual audio coders use an estimated masked threshold to determine the maximum permissible, just-inaudible noise level introduced by quantization. This estimate is derived from a psychoacoustic model that mimics the properties of masking. Most psychoacoustic models for coding applications use a uniform (equal-bandwidth) spectral decomposition as a first step to approximate the frequency selectivity of the human auditory system. However, the identical filter properties of the uniform subbands do not match the nonuniform characteristics of cochlear filters, which reduces the precision of psychoacoustic modeling. Even so, uniform filter banks are used because they are computationally efficient. This paper presents a psychoacoustic model based on an efficient nonuniform cochlear filter bank and a simple masked threshold estimation. The novel filter-bank structure employs cascaded low-order IIR filters and appropriate down-sampling to increase efficiency. The filter responses are optimized for the modeling of auditory masking effects. Applied to audio coding, the new psychoacoustic model performs better in terms of bit rate and/or quality than other state-of-the-art models that use a uniform spectral decomposition. Its low delay makes the new model particularly suitable for low-delay coders.
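
The general idea of a cascade of low-order IIR sections with down-sampling can be illustrated with a minimal sketch. This is not the filter bank from the paper: the one-pole low-pass sections, the function names `one_pole_lowpass` and `cascaded_filter_bank`, the cutoff values, and the rule "halve the rate once the cutoff is at or below one eighth of the sample rate" are all assumptions chosen for illustration, and the paper's optimized filter responses are not reproduced here.

```python
import math


def one_pole_lowpass(x, cutoff_hz, fs_hz):
    """First-order IIR low-pass: y[n] = (1 - a) * x[n] + a * y[n - 1]."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / fs_hz)
    y, state = [], 0.0
    for sample in x:
        state = (1.0 - a) * sample + a * state
        y.append(state)
    return y


def cascaded_filter_bank(x, fs_hz, cutoffs_hz):
    """Feed x through a chain of low-pass sections and tap every stage.

    cutoffs_hz is expected in descending order (high to low frequency).
    Each stage filters the previous stage's output. Once a stage's cutoff
    is at or below 1/8 of the current sample rate (an assumed, conservative
    rule), the signal passed to later stages is decimated by 2, so those
    stages process fewer samples.
    Returns a list of (sample_rate_hz, samples) tuples, one per stage.
    """
    taps = []
    signal, rate = list(x), float(fs_hz)
    for fc in cutoffs_hz:
        signal = one_pole_lowpass(signal, fc, rate)
        taps.append((rate, signal))
        if fc <= rate / 8.0:
            signal = signal[::2]  # keep every second sample
            rate /= 2.0
    return taps


# Hypothetical usage: a constant (DC) input through four stages.
taps = cascaded_filter_bank([1.0] * 256, 32000, [8000, 4000, 2000, 1000])
for rate, samples in taps:
    print(rate, len(samples))
```

Because the later, low-frequency stages run at a reduced rate, the work per input sample shrinks along the cascade, which is the efficiency argument the abstract makes for this kind of structure.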

Similar resources

Binaural cue coding-Part I: psychoacoustic fundamentals and design principles

Binaural Cue Coding (BCC) is a method for multichannel spatial rendering based on one down-mixed audio channel and BCC side information. The BCC side information has a low data rate and it is derived from the multichannel encoder input signal. A natural application of BCC is multichannel audio data rate reduction since only a single down-mixed audio channel needs to be transmitted. An alternati...

A computationally efficient cochlear filter bank for perceptual audio coding

Many applications in auditory modeling require analysis filters that approximate the frequency selectivity given by psychophysical data, e.g. from masking experiments using narrow-band maskers. This frequency selectivity is largely determined by the spectral decomposition process inside the human cochlea. Currently used spectral decomposition schemes for masking modeling in audio coding general...

An Improved Psychoacoustic Model for Audio Coding Based on Wavelet Packet

This paper describes a new design of a psychoacoustic model for audio coding following the model used in the standard MPEG-1 audio layer 3 using an appropriate wavelet packet decomposition of the speech/audio signal. The design of a psychoacoustic model is achieved by wavelet packet decomposition whose connections are selected in such a way that sub bands correspond to the best possible one to ...

Design of the Audio Coding Standards for MPEG and AC-3

ISO MPEG-1/2 and Dolby AC-3 are widely used in the network, wireless, multimedia system, and video industries. This dissertation studies the design of the audio standards MPEG-1/2 and AC-3. Perceptual audio coders such as MPEG-1/2 and AC-3 can be analyzed in terms of their filter bank, psychoacoustic model, stereo matrix, bit allocation/quantization, and packing blocks. This dissertation considers the design fo...

Speech enhancement system for hands-free telephone based on the psychoacoustically motivated filter bank with allpass frequency transformation

This paper describes the application of a multirate acoustic echo canceller derived from a cochlea model, together with a psychoacoustically motivated noise and residual echo attenuation system that operates on the signal decomposed into cochlear-spaced subbands. Polyphase realisation of the FIR filter bank with non-uniform frequency resolution achieved by the frequency transformation of filter characteristic u...

Journal:
  • IEEE Trans. Speech and Audio Processing

Volume 10

Publication date: 2002